Add bulkFetch and documentClassCounts API functions #55

Merged
stevevanhooser merged 3 commits into main from claude/ndi-cloud-api-porting-wumxY
Apr 21, 2026

@stevevanhooser
Contributor

Summary

Add two new document API functions to support efficient bulk document retrieval and document class histogram queries.

Key Changes

  • bulkFetch(): New function to synchronously fetch up to 500 documents by ID via POST /datasets/{datasetId}/documents/bulk-fetch

    • Validates inputs: non-empty list, max 500 entries, each a 24-character hex string
    • Returns list of document dicts with full data
    • Silently omits non-existent, soft-deleted, or mismatched documents
    • Intended for small subsets (e.g., from ndiquery results)
  • documentClassCounts(): New function to retrieve document class histogram via GET /datasets/{datasetId}/document-class-counts

    • Returns flat histogram grouped by leaf data.document_class.class_name
    • Includes fields: datasetId, totalDocuments, and classCounts mapping
    • Missing/empty class names bucketed under 'unknown'
  • Input validation: Added regex pattern _HEX24 to validate 24-character hex document IDs

  • Comprehensive test coverage: Added 8 unit tests covering happy paths, validation errors, and edge cases

  • MATLAB bridge documentation: Updated sync metadata tracking both functions as synchronized with MATLAB main as of 2026-04-20
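The response shape described for documentClassCounts() can be illustrated with a small local sketch. This is not the server's implementation: the helper name class_histogram, the sample dataset ID, and the sample documents are hypothetical. Only the field names datasetId, totalDocuments, and classCounts, the data.document_class.class_name path, and the 'unknown' bucket come from the description above.

```python
from collections import Counter


def class_histogram(dataset_id, documents):
    """Build a flat histogram keyed by class name (hypothetical helper)."""
    counts = Counter()
    for doc in documents:
        # Walk data.document_class.class_name, tolerating missing levels.
        data = doc.get("data") or {}
        doc_class = data.get("document_class") or {}
        name = doc_class.get("class_name")
        # Missing or empty class names are bucketed under 'unknown'.
        counts[name or "unknown"] += 1
    return {
        "datasetId": dataset_id,
        "totalDocuments": sum(counts.values()),
        "classCounts": dict(counts),
    }


# Illustrative input only; these documents are made up for the sketch.
sample_docs = [
    {"data": {"document_class": {"class_name": "subject"}}},
    {"data": {"document_class": {"class_name": "subject"}}},
    {"data": {"document_class": {"class_name": ""}}},
    {"data": {}},
]
result = class_histogram("0123456789abcdef01234567", sample_docs)
```

In this sketch totalDocuments always equals the sum of the classCounts values, because every document lands in exactly one bucket.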

Implementation Details

  • Both functions use @_auto_client and @validate_call decorators for consistency with existing API wrappers
  • Input validation runs before any API call, so invalid inputs fail fast without a network round trip
  • bulkFetch returns an empty list when the response has no documents field
  • Follows existing patterns: delegates HTTP metadata to CloudClient, returns only the data payload
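The fail-fast validation and the empty-list fallback can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the merged code: the PR names the pattern _HEX24 but does not show it, so the regex below is one plausible definition, and the helper names validate_document_ids and extract_documents, the use of ValueError, and the case-insensitive hex match are all hypothetical.

```python
import re

# Assumption: one plausible definition of "24-character hex string".
_HEX24 = re.compile(r"[0-9a-fA-F]{24}")
_MAX_IDS = 500  # limit stated in the PR description


def validate_document_ids(document_ids):
    """Raise ValueError before any HTTP call if the ID list is unusable."""
    if not document_ids:
        raise ValueError("document_ids must be a non-empty list")
    if len(document_ids) > _MAX_IDS:
        raise ValueError(f"at most {_MAX_IDS} document IDs are allowed")
    for doc_id in document_ids:
        # fullmatch rejects strings with extra leading or trailing characters.
        if not isinstance(doc_id, str) or not _HEX24.fullmatch(doc_id):
            raise ValueError(f"not a 24-character hex ID: {doc_id!r}")


def extract_documents(payload):
    """Return the 'documents' array, or an empty list when it is absent."""
    return payload.get("documents", [])
```

A caller would run validate_document_ids(ids) first and only then issue the POST, passing the decoded JSON body to extract_documents.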

https://claude.ai/code/session_01Wv5mG4qAT66WtQ2NjMQP88

claude added 3 commits April 20, 2026 23:33
Mirrors two new commands added to the MATLAB +ndi/+cloud/+api/+documents
namespace. MATLAB routes them through +implementation wrappers that
normalize output style; the Python port uses CloudClient for the same
role, so no +implementation mirror is needed.

INTERFACE UPDATE: Added bulkFetch and documentClassCounts entries to
src/ndi/cloud/api/ndi_matlab_python_bridge.yaml.

- bulkFetch: POST /datasets/{datasetId}/documents/bulk-fetch; mirrors
  MATLAB input validation (non-empty, <= 500 entries, 24-char hex IDs)
  and returns the 'documents' array.
- documentClassCounts: GET /datasets/{datasetId}/document-class-counts;
  returns the datasetId/totalDocuments/classCounts struct.

The cloud search API no longer exposes document_class.class_name as a
directly searchable field path. Class filtering now has to go through
the 'isa' operator, which also rolls up subclasses. This was causing
test_ndiqueryAll_paginates to return zero documents against the live
server.

Only the two cloud ndiquery tests are affected. Inline document bodies
(e.g. {"document_class": {"class_name": "..."}}) and local
session.database_search calls continue to use the field directly since
they are not cloud search structures.

Replaces the regex-on-document_class.class_name idiom with the
semantic equivalent ndi_query.all(), which is a static factory for
isa('base'). Matches the NDI-matlab ndi.query.all() convention and
avoids relying on the soon-to-be-removed document_class field path.

stevevanhooser merged commit cd0346d into main on Apr 21, 2026
5 checks passed
stevevanhooser deleted the claude/ndi-cloud-api-porting-wumxY branch on Apr 21, 2026 00:15